Objectives

Setup

Attach packages

In the setup chunk in your RMarkdown document, attach the following packages:

Note: you may need to install these packages first if you don’t already have them (recall: install.packages("packagename")).
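The package list itself is missing from this section. Judging only from the functions used later in the lab (read_csv(), here(), st_as_sf(), tm_shape(), and so on), a setup chunk along these lines should cover everything. Treat the exact list as an assumption rather than the original one.

```r
# Assumed package list, inferred from the functions used later in this lab
library(tidyverse) # read_csv(), dplyr, tidyr, stringr, forcats, ggplot2
library(here)      # here() for project-relative file paths
library(sf)        # st_as_sf(), st_crs(), read_sf(), st_transform()
library(tmap)      # tmap_mode(), tm_shape(), tm_dots()
```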

Read in the data

The data you’ll use (to start) is within the data/sf_trees subfolder. Use the here package to read in the sf_trees.csv file.

sf_trees <- read_csv(here("data","sf_trees","sf_trees.csv"))

About the data: SF trees data are from the SF Open Data Portal. See more information from Thomas Mock and TidyTuesday here.

Check out the data using exploratory functions (e.g. View(), names(), summary(), etc.). Remember that those probably do not belong in your .Rmd code chunks (if you don’t need a record, you can either comment it out or put it in the Console).

Part 1: Wrangling & ggplot review

Example 1: Find counts of observations by legal_status & wrangle a bit:

# Way 1: group_by %>% summarize %>% n
sf_trees %>% 
  group_by(legal_status) %>% 
  summarize(tree_count = n())
## # A tibble: 10 x 2
##    legal_status                 tree_count
##    <chr>                             <int>
##  1 DPW Maintained                   141725
##  2 Landmark tree                        42
##  3 Permitted Site                    39732
##  4 Planning Code 138.1 required        971
##  5 Private                             163
##  6 Property Tree                       316
##  7 Section 143                         230
##  8 Significant Tree                   1648
##  9 Undocumented                       8106
## 10 <NA>                                 54
# Way 2: Same thing (+ a few other dplyr functions)
top_5_status <- sf_trees %>% 
  count(legal_status) %>% 
  drop_na(legal_status) %>% 
  rename(tree_count = n) %>% 
  relocate(tree_count) %>% 
  slice_max(tree_count, n = 5) %>% 
  arrange(-tree_count)
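To see that the two approaches produce the same counts, here is a minimal sketch on a made-up data frame. toy_trees and its values are hypothetical stand-ins for sf_trees, not the real data.

```r
library(dplyr)

# Hypothetical toy data standing in for sf_trees
toy_trees <- tibble(
  legal_status = c("Private", "Private", "Landmark tree", NA, "Private", "Landmark tree")
)

# Way 1: group, then count the rows in each group
way_1 <- toy_trees %>%
  group_by(legal_status) %>%
  summarize(tree_count = n())

# Way 2: count() groups and counts in one step; `name` sets the count column
way_2 <- toy_trees %>%
  count(legal_status, name = "tree_count")

# Both keep a row for NA and report the same counts
identical(way_1$tree_count, way_2$tree_count)
```

Using the name argument of count() avoids the separate rename() step when all you need is a different column name.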

Make a graph of the top 5 from above:

ggplot(data = top_5_status, aes(x = fct_reorder(legal_status, tree_count), y = tree_count)) +
  geom_col() +
  labs(y = "Tree count", x = "Legal Status") +
  coord_flip() +
  theme_minimal() 

Example 2: Only keep observations where legal status is Permitted Site and caretaker is MTA. Store as permitted_mta.

permitted_mta <- sf_trees %>% 
  filter(legal_status == "Permitted Site", caretaker == "MTA")
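In filter(), conditions separated by commas are combined with AND, so a row must satisfy all of them to be kept. A small sketch on hypothetical data (toy_trees is made up for illustration):

```r
library(dplyr)

# Hypothetical toy data with the two columns used in Example 2
toy_trees <- tibble(
  legal_status = c("Permitted Site", "Permitted Site", "Private"),
  caretaker    = c("MTA", "Private", "MTA")
)

# Comma-separated conditions: both must be TRUE for a row to be kept
with_comma <- toy_trees %>%
  filter(legal_status == "Permitted Site", caretaker == "MTA")

# Equivalent, written with an explicit &
with_and <- toy_trees %>%
  filter(legal_status == "Permitted Site" & caretaker == "MTA")

nrow(with_comma)
nrow(with_and)
```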

Example 3: Only keep Blackwood Acacia trees, then only keep columns legal_status, date, latitude and longitude. Store as blackwood_acacia.

The stringr package contains a bunch of useful functions for finding & working with strings (e.g. words). One is str_detect(), which detects a specific string within a column.

blackwood_acacia <- sf_trees %>% 
  filter(str_detect(species, "Blackwood Acacia")) %>% 
  select(legal_status, date, latitude, longitude)
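It can help to look at what str_detect() returns on its own before using it inside filter(). The sketch below uses a hypothetical character vector in the "scientific :: common" form; the specific entries are illustrative.

```r
library(stringr)

# Hypothetical species strings, plus a missing value
species <- c(
  "Acacia melanoxylon :: Blackwood Acacia",
  "Quercus agrifolia :: Coast Live Oak",
  NA
)

# One logical value per element: TRUE on a match, FALSE otherwise, NA for missing input
is_blackwood <- str_detect(species, "Blackwood Acacia")
is_blackwood
```

filter() keeps only rows where the condition is TRUE, so rows with a missing species are dropped along with the non-matches.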

# Make a little graph of locations (note R doesn't know these are spatial)
ggplot(data = blackwood_acacia, aes(x = longitude, y = latitude)) + 
  geom_point()

Example 4: Meet tidyr::separate()

Separate the species column into two separate columns: spp_scientific and spp_common

sf_trees_sep <- sf_trees %>% 
  separate(species, into = c("spp_scientific", "spp_common"), sep = " :: ")
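A minimal sketch of what separate() does, on a hypothetical two-row data frame (toy_trees and its values are made up for illustration):

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data in the same "scientific :: common" format
toy_trees <- tibble(
  tree_id = 1:2,
  species = c(
    "Acacia melanoxylon :: Blackwood Acacia",
    "Quercus agrifolia :: Coast Live Oak"
  )
)

# Split species at " :: " into two new columns; the original column is removed
toy_sep <- toy_trees %>%
  separate(species, into = c("spp_scientific", "spp_common"), sep = " :: ")

names(toy_sep)
```

Including the surrounding spaces in sep keeps stray whitespace out of the new columns.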

Example 5: Meet tidyr::unite()

Yeah, it does the opposite. Unite the tree_id and legal_status columns, using a separator of “_COOL_” (no, you’d never actually do this…).

ex_5 <- sf_trees %>% 
  unite("id_status", tree_id:legal_status, sep = "_COOL_")
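Here is the same unite() call on a hypothetical three-column data frame, to show what happens to the inputs and to columns outside the selected range (toy_trees is made up for illustration):

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data; caretaker is outside the tree_id:legal_status range
toy_trees <- tibble(
  tree_id      = 1:2,
  legal_status = c("Private", "Landmark tree"),
  caretaker    = c("Private", "DPW")
)

# Paste tree_id and legal_status together; the input columns are dropped by default
toy_united <- toy_trees %>%
  unite("id_status", tree_id:legal_status, sep = "_COOL_")

toy_united$id_status
names(toy_united)
```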

Stage, commit, pull, push to GitHub!

Part 2: Make some actual maps

You need sf and tmap successfully attached to do this part. We’ll convert lat/lon to spatial data (after the conversion, notice that there’s a new column called geometry), then use geom_sf() to plot.

Step 1: Convert the lat/lon to spatial points

Use st_as_sf() to convert to spatial coordinates:

blackwood_acacia_sp <- blackwood_acacia %>% 
  drop_na(longitude, latitude) %>% 
  st_as_sf(coords = c("longitude","latitude")) # Convert to spatial coordinates

# But we need to set the coordinate reference system (CRS) so it's compatible with the street map of San Francisco we'll use as a "base layer":
st_crs(blackwood_acacia_sp) <- 4326

# Then we can use `geom_sf`!

ggplot(data = blackwood_acacia_sp) +
  geom_sf(color = "darkgreen") +
  theme_minimal()
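The conversion and the CRS assignment can also be done in a single call, since st_as_sf() accepts a crs argument for data frames. A minimal sketch on two hypothetical points (toy_points and its coordinates are made up for illustration):

```r
library(sf)

# Hypothetical points roughly within San Francisco
toy_points <- data.frame(
  longitude = c(-122.42, -122.45),
  latitude  = c(37.77, 37.75)
)

# coords takes x (longitude) first, then y (latitude); crs sets the CRS directly
toy_sp <- st_as_sf(toy_points, coords = c("longitude", "latitude"), crs = 4326)

names(toy_sp)       # the coordinate columns are replaced by geometry
st_crs(toy_sp)$epsg
```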

But that’s not especially useful unless we have an actual map of SF to plot this on, right?

Read in the SF shapefile (data/sf_map/tl_2017_06075_roads.shp):

sf_map <- read_sf(here("data","sf_map","tl_2017_06075_roads.shp"))

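# st_transform() returns a new object, so store the result (otherwise sf_map itself is unchanged)
sf_map <- st_transform(sf_map, 4326)
sf_map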
## Simple feature collection with 4087 features and 4 fields
## geometry type:  LINESTRING
## dimension:      XY
## bbox:           xmin: -122.5136 ymin: 37.70813 xmax: -122.3496 ymax: 37.83213
## geographic CRS: WGS 84
## # A tibble: 4,087 x 5
##    LINEARID   FULLNAME     RTTYP MTFCC                                  geometry
##  * <chr>      <chr>        <chr> <chr>                          <LINESTRING [°]>
##  1 110498938… Hwy 101 S O… M     S1400 (-122.4041 37.74842, -122.404 37.7483, -…
##  2 110498937… Hwy 101 N o… M     S1400 (-122.4744 37.80691, -122.4746 37.80684,…
##  3 110366022… Ludlow Aly … M     S1780 (-122.4596 37.73853, -122.4596 37.73845,…
##  4 110608181… Mission Bay… M     S1400 (-122.3946 37.77082, -122.3929 37.77092,…
##  5 110366689… 25th Ave N   M     S1400 (-122.4858 37.78953, -122.4855 37.78935,…
##  6 110368970… Willard N    M     S1400 (-122.457 37.77817, -122.457 37.77812, -…
##  7 110368970… 25th Ave N   M     S1400 (-122.4858 37.78953, -122.4858 37.78952,…
##  8 110498933… Avenue N     M     S1400 (-122.3643 37.81947, -122.3638 37.82064,…
##  9 110368970… 25th Ave N   M     S1400  (-122.4854 37.78983, -122.4858 37.78953)
## 10 110367749… Mission Bay… M     S1400 (-122.3865 37.77086, -122.3878 37.77076,…
## # … with 4,077 more rows
ggplot(data = sf_map) +
  geom_sf()
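One thing to keep in mind: st_transform() returns a transformed copy and does not change its input, so the result only persists if you assign it. A minimal sketch, using a hypothetical point that is given EPSG:4269 purely for illustration:

```r
library(sf)

# Hypothetical single point, assigned EPSG:4269 for the sake of the example
toy_nad83 <- st_as_sf(
  data.frame(x = -122.42, y = 37.77),
  coords = c("x", "y"),
  crs = 4269
)

# Without assignment, the transformed copy is discarded
invisible(st_transform(toy_nad83, 4326))
st_crs(toy_nad83)$epsg # input is unchanged

# With assignment, the transformed version is kept
toy_wgs84 <- st_transform(toy_nad83, 4326)
st_crs(toy_wgs84)$epsg
```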

Now combine them:

ggplot() +
  geom_sf(data = sf_map,
          size = 0.1,
          color = "darkgray") +
  geom_sf(data = blackwood_acacia_sp, 
          color = "red", 
          size = 0.5) +
  theme_void() +
  labs(title = "Blackwood acacias in San Francisco")

Now an interactive one!

tmap_mode("view")

tm_shape(blackwood_acacia_sp) + 
  tm_dots()

Wrap up:

Make sure you stage, commit, pull, then push back to GitHub. Done!

END Lab 1